METHOD, SYSTEM, AND PROGRAM FOR GATHERING
INDEXABLE METADATA ON CONTENT AT A DATA REPOSITORY 

BACKGROUND OF THE INVENTION 
1. Field of the Invention

The present invention relates to a method, system, and program for gathering
indexable metadata on content at an electronic data repository.

2. Description of the Related Art 

To locate documents on the Internet, users typically use an Internet search engine.
Internet users enter one or more key search terms, which may include boolean operators
for the search, and transmit the search request to a server including a search engine. The
search engine maintains an index of information from web pages on the Internet. This
index provides search terms for a particular Web address or Uniform Resource Locator
(URL). If the index terms for a URL in the search engine database satisfy the Internet
user search query, then that URL is returned in response to the query.

Search engine providers need to constantly update their URL database to provide
a more accurate and larger universe of potential search results that may be returned to the
user. Search engine companies sometimes employ a robot that searches and categorizes
Web pages on the basis of metatags and content in the located HTML pages. A robot is
a program that automatically traverses the Web's hypertext structure by retrieving an
HTML page, and then recursively retrieving all documents referenced from the retrieved
page. Web robots released by search engines to access and index Web pages are referred
to as Web crawlers and Web spiders.

Search engines having a database of indexable terms for URLs generated by
robots are quite common and popular. However, one noticeable disadvantage of
such robot generated URL databases is that periodic updates to the URL web site may
render the URL database inaccurate and outdated until the robot rechecks a previously
indexed page. Further, search engine robots are currently designed to search for HTML
pages and parse HTML content into index search terms in the search engine database.
However, many web pages provide content in formats that are not accessible or parseable
to prior art search engine robots that are designed to traverse HTML pages, such as
content encoded in various multi-media formats, e.g., MPEG, SHOCKWAVE, ZIP files,
etc. Further, web site content may be dynamic and accessible by providing a search term
that is then used by a program, e.g., a Common Gateway Interface (CGI) program, Java
programs, Microsoft Active Server pages, etc., to query a database and return search
results. Such dynamic data accessible through queries is typically neither identified by
prior art search engine robots nor indexed in the search engine URL database.

A still further disadvantage is that Web robots have been known to overload web
servers and present security hazards. For this reason, many web sites use a firewall that
restricts the search engine web robot from accessing and cataloging the content, even
when the web site provider would want their information publicly available. Web site
providers may also limit a web robot's access to a site by creating a "robots.txt" file that
indicates URLs on the site that the robot is not permitted to access and index. Such
limitations of search engine web robots may prevent the web robot from accessing
relevant web pages that would be of significant interest to search engine users.
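
By way of illustration only, the effect of such a "robots.txt" file on a robot may be
sketched with the robot-exclusion parser in the Python standard library. The robot name
"ExampleBot", the example.com addresses, and the rule shown are assumptions made for
this sketch and do not describe any particular web site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that a web site provider might publish.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file as a sequence of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved robot asks before fetching each URL.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/private/data.html")

print(allowed, blocked)
```

In this sketch the robot may index the home page but never reaches anything under
/private/, even if that content would interest search engine users.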

Some search engines use a manual taxonomist. For instance, Yahoo receives a
manual submission of a web page and then categorizes the web page for inclusion in its
database. This approach may be very time consuming. Further, the manual taxonomical
approach cannot catalog as many pages as a robot approach that continually traverses the
Internet, i.e., World Wide Web, for new pages and that is not limited to content that is
submitted by users.

Thus, there is a need in the art for an improved technique for cataloging web
pages.

SUMMARY OF THE PREFERRED EMBODIMENTS
To overcome the limitations in the prior art described above, preferred
embodiments disclose a method, system, and program for searching a data repository
managed by a content provider to gather indexable metadata on content at addressable
locations at the data repository. Settings capable of being customized by the content
provider are accessed. The customized settings provide instructions on how to search the
content provider's data repository. The content of content pages at the content provider's
data repository is accessed in accordance with instructions included in the accessed
customized settings. Metadata from accessed content pages is generated and added to an
index of metadata for accessed addressable locations at the data repository.

In further embodiments, the accessed customizable settings may provide 
addressable locations at the content provider's data repository provided by the content 
provider. In such case, accessing the content pages includes accessing the content pages 
at the provided addressable locations, wherein metadata is generated for the accessed 
content pages. 

Still further, the accessed customizable settings further provide query terms for at 
least one provided addressable location. For each provided addressable location for 
which there are query terms, the provided query terms are used at the provided 
addressable location to obtain query results. Metadata is then generated from the 
obtained query results to add to the index of metadata for accessed addressable locations 
at the data repository. 

In still further embodiments, the accessed customizable settings further indicate 
validation checking programs. Each validation checking program indicated in the 
accessed settings is executed against each accessed content page. A validation output 
result for each accessed content page is generated with each validation checking program 
describing characteristics of the content page. Metadata is generated from the validation 
output result to add to the index of metadata for accessed addressable locations at the data 
repository.

Still further, a determination is made of a parser capable of parsing an embedded
file referenced in the content page. The content of the referenced embedded file is parsed
and metadata is generated for the parsed content of the embedded file to add to the index.

Preferred embodiments provide a technique for searching data repositories to
gather indexable metadata for URLs at the data repository that allows the owner of the
data repository greater control over how metadata is gathered. For instance, preferred
embodiments allow the content provider to specify URLs to search for indexable
metadata. Moreover, with preferred embodiments, metadata may be gathered for content
encoded in formats not accessed by prior art search robots, such as content in multimedia
files such as movie files, Shockwave files, ZIP files, etc. Still further, with preferred
embodiments, the data repository owner may control what metadata is provided by
selecting validation checking programs to generate metadata indicating whether the
content at the URL satisfies certain validation criteria or satisfies selected qualifiers.

These preferred embodiment techniques are an improvement over the current art
where Web robots search only for content in a text format at the URL and do not access
embedded files in non-textual encoding, such as multi-media files or other compressed
files. Further, by allowing the data repository owner to tailor how URLs are searched and
metadata is gathered, the data repository owner can improve the indexable metadata
available for the data repository, thereby improving the quality of search results.

Moreover, the preferred embodiment collection tool may be used by a data
warehouse company that is engaged in the commercial gathering of metadata on URL
pages to gather highly relevant metadata for URL sites. The data warehouse can provide
or sell subscriptions to content providers to gather metadata on the subscriber URLs and
then sell or license metadata to interested parties.

BRIEF DESCRIPTION OF THE DRAWINGS
Referring now to the drawings in which like reference numbers represent
corresponding parts throughout:

FIG. 1 illustrates a relationship between a content provider and warehouse
collecting indexable metadata for data repositories in accordance with preferred
embodiments of the present invention;

FIG. 2 illustrates program components of a collection tool used to gather
indexable metadata at URLs in accordance with preferred embodiments of the present
invention;

FIG. 3 illustrates a structure of a file used to control how the collection tool
searches URLs in accordance with preferred embodiments of the present invention; and

FIGs. 4, 5, and 6 illustrate logic performed by the collection tool to gather
indexable metadata from URLs in accordance with preferred embodiments of the present
invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
In the following description, reference is made to the accompanying drawings
which form a part hereof, and which illustrate several embodiments of the present
invention. It is understood that other embodiments may be utilized and structural and
operational changes may be made without departing from the scope of the present
invention.

FIG. 1 illustrates a relationship established between a content provider 2 and a
metadata warehouse 4. The metadata warehouse 4 gathers indexable metadata on URLs
from different content site owners 2. The metadata warehouse 4 may provide this
information to a search engine provider to use to update its URL database. Alternatively,
the metadata warehouse 4 may be part of the search engine provider that gathers
indexable metadata for the search engine database. The content provider 2 wants to make
information on its web site available to the metadata warehouse 4 in order to allow
Internet users to locate URLs at its site when doing searches. The browsable content of
the content provider 2 is maintained in a repository 8. An Internet user could access data
in the repository 8 by presenting a URL. If the repository includes dynamic data, then the
Internet user may have to provide parameters along with a CGI command to access the
dynamic data in the repository 8. The data repository 8 may include HTML pages as well
as non-HTML content accessible over the Internet. The non-HTML content in the data
repository 8 could include dynamic data, or data maintained in a file format requiring a
plug-in application to render or process, such as a movie file, Shockwave file, or any
other multi-media format.

The metadata warehouse 4 provides a collection tool 6, described in detail below,
that would search the repository 8 for information on the content at the URLs in the
repository 8 and generate metadata 10 based on the content at accessed URLs in the
repository 8. This metadata 10 would then be provided to the warehouse 4, preferably
over a network connection such as the Internet. The collection tool 6 may be executed
by a warehouse 4 server to search the repository 8 from an external location.
Alternatively, the content provider 2 may execute the collection tool 6 on a content
provider 2 computer capable of accessing the repository 8 to execute and gather metadata
10 to transmit back to the warehouse 4 server.

FIG. 1 illustrates the presence of an access barrier 12 which would prevent prior
art web robots from collecting indexable metadata on the URLs in the content provider's
2 repository 8. This access barrier 12 may comprise an encoding format, e.g., MPEG (or
any other multimedia file format), Shockwave, ZIP files, CGI, Extensible Markup
Language (XML), a Java program, etc., that is not accessible to prior art robots that
typically are only capable of parsing and gathering information on HTML web pages.
Moreover, the access barrier 12 may comprise a firewall that prevents robots from
traversing the repository 8 URLs.

FIG. 2 illustrates program components within the collection tool 6. The collection
tool 6 includes robot type functions known in the art for traversing web pages and is
designed to gather indexable metadata from content at URLs in the repository 8. The
collection tool 6 determines its modus operandi of searching based on a structured
document referred to as a search instruction file 20. The search instruction file 20 may be
stored in the content provider's repository 8 at a URL that the collection tool 6 is
programmed to access. Alternatively, when the warehouse 4 provides the content
provider 2 the collection tool 6, the search instruction file 20 may be included in the
collection tool 6 installation package and installed with the collection tool 6 files.

The collection tool 6 would further include parsers 22a, b, ... n that are capable of
parsing and gathering data from different content encodings, such as HTML, XML,
Shockwave, MPEG, JPEG, ZIP files or any other multimedia content format or
compression type. These parsers 22a, b, ... n allow the collection tool 6 to gather
indexable metadata from the numerous content types available at web pages. In this way,
in preferred embodiments, the indexable metadata 10 is not limited to content encoded in
HTML or other text formats, but may include information on content encoded in other
formats, such as multimedia file formats.
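
One possible arrangement of such parsers, offered only as a sketch, is a table that maps
a content type to a parsing function. The names PARSERS, parse_html, parse_zip, and
gather_metadata are hypothetical, as is the choice to key the table on MIME type; the
ZIP parser here records only the archive member names as indexable terms.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the words found in the text nodes of an HTML page."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        self.words.extend(data.split())


def parse_html(content: bytes) -> dict:
    collector = _TextCollector()
    collector.feed(content.decode("utf-8", errors="replace"))
    return {"format": "html", "terms": collector.words}


def parse_zip(content: bytes) -> dict:
    # Index the names of the files stored in the archive.
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        return {"format": "zip", "terms": archive.namelist()}


# Hypothetical parser table standing in for parsers 22a, b, ... n.
PARSERS = {"text/html": parse_html, "application/zip": parse_zip}


def gather_metadata(content_type: str, content: bytes) -> dict:
    parser = PARSERS.get(content_type)
    if parser is None:
        # No parser registered: report the format but no indexable terms.
        return {"format": content_type, "terms": []}
    return parser(content)


# Build a small ZIP file in memory to exercise the ZIP parser.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("readme.txt", "hello")

zip_metadata = gather_metadata("application/zip", buffer.getvalue())
html_metadata = gather_metadata("text/html", b"<p>Hello world</p>")
```

Supporting a further encoding under this arrangement would amount to registering one
more function in the table.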

The collection tool 6 further includes validation checkers 24a, b, ... n that are
programs that process the page at the URL to determine whether the page satisfies
certain predetermined conditions. The output of such validation checkers 24a, b, ... n
indicates the analyzed Web page's conformance to certain standards and conditions. This
validation checker output may be added to the metadata for the Web page. For instance,
the validation checker 24a, b, ... n may comprise any one of the following validation
checker programs:

World Wide Web Consortium (W3C) HTML Validation Service: checks HTML
pages to determine their conformance to W3C HTML and XHTML
recommendations and standards, as well as XML well-formedness.

XML Well-formedness Checker and Validator: checks an XML document for
well-formedness and (optionally) validity.

BOBBY: is a web-based tool that analyzes web pages for their accessibility to
people with disabilities. BOBBY's analysis of accessibility is based on the W3C
Web Content Accessibility Guidelines. For example, to become Bobby approved,
a Web site must: provide text equivalents for all non-text elements (i.e., images,
animations, audio, video); provide summaries of graphs and charts; ensure that all
information conveyed with color is also available without color; clearly identify
changes in the natural language of a document's text and any text equivalents (e.g.,
captions) of non-text content; organize content logically; and clearly provide
alternative content for features (e.g., applets or plug-ins) that may not be
supported.

Security: A validation checker 24a, b, ... n may determine any security settings or
security guarantees and return indexable information on such security settings.

Privacy: A validation checker 24a, b, ... n may determine any privacy guarantee or
any branded logos or seals of approval concerning the privacy conditions at the
URL, such as the TRUSTe seal of approval, which indicates the approval and
participation in the TRUSTe privacy program.

Ratings or Awards: A validation checker 24a, b, ... n may determine whether the
site has any specified ratings or awards. In preferred embodiments, the content
provider 2 may specify to the warehouse 4 any particular ratings or awards.

"Best Viewed By": A validation checker 24a, b, ... n may determine whether the
web page at the URL indicates a preferred web browser to use for viewing the
Web page.

Endorsements and Approvals: A validation checker 24a, b, ... n may determine
whether the site has any particular endorsements or approvals. In preferred
embodiments, the content provider 2 may specify to the warehouse 4 any
particular endorsements and approvals to check. For instance, the content
provider 2 may request that the collection tool 6 check the page for endorsement
from a particular religious organization to ensure a certain level of approval.

Warning: A validation checker 24a, b, ... n may determine whether the site has
any warnings, such as "No one under 18", "Adult Content", etc. In preferred
embodiments, the content provider 2 may specify to the warehouse 4 any
particular warnings to check.

The validation checkers 24a, b, ... n may check for any other characteristics of a
web page or the content therein. Moreover, the validation checkers 24a, b, ... n may
include parser programs for different content encodings to review characteristics of
non-HTML pages. For instance, a validation checker 24a, b, ... n may parse a movie file,
e.g., MPEG or an Apple QuickTime file, to determine whether there is any content
warning, e.g., Adult Content, Graphic Violence, "R" rated, "XXX" rated, etc.

The validation checkers 24a, b, ... n may utilize the parsers 22a, b, ... n to examine the
content file or, alternatively, the validation checkers 24a, b, ... n may include their own
parser to examine the page content.
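
As a minimal sketch of one such validation checker, an XML well-formedness check may be
written with the XML parser in the Python standard library. The function name
check_xml_well_formed and the shape of the returned record are assumptions made for
illustration; the sketch checks well-formedness only, not validity against a DTD or
schema.

```python
import xml.etree.ElementTree as ET


def check_xml_well_formed(document: str) -> dict:
    """Return a validation output record for one XML document."""
    try:
        ET.fromstring(document)
    except ET.ParseError as error:
        # The parser rejects documents that are not well-formed.
        return {"checker": "xml-well-formedness", "passed": False, "detail": str(error)}
    return {"checker": "xml-well-formedness", "passed": True, "detail": ""}


good = check_xml_well_formed("<page><title>Home</title></page>")
bad = check_xml_well_formed("<page><title>Home</page>")  # <title> is never closed

print(good["passed"], bad["passed"])
```

The returned record is the kind of validation checker output that could be added to
the metadata for the page.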

FIG. 3 illustrates components within the search instruction file 20 that instruct the
collection tool 6 on how to process the URLs in the repository 8. In preferred
embodiments, the content provider 2 would configure the search instruction file 20 to
control how the collection tool 6 searches the repository 8. A search URLs component 50
indicates URLs that the collection tool 6 should access and search according
to the other components. The query terms component 52 provides a list of search terms
for a specified URL for the collection tool 6 to apply against that URL to access dynamic
data through the web page at the URL. The repository 8 may include CGI or Java
programs to access and generate dynamic data in response to the query terms. The
collection tool 6 may construct a URL including a search term to present to the page
when performing queries at the URL page. A query term qualifiers component 54 provides
predicates to apply against the query results for one or more query terms to determine if
the query result satisfies the qualifier predicate.
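
For illustration, constructing a URL that carries a query term might look as follows.
The function name build_query_urls, the parameter name "q", and the example.com address
are hypothetical; an actual CGI or Java program at the repository would define its own
parameter names.

```python
from urllib.parse import urlencode


def build_query_urls(page_url, query_terms, parameter="q"):
    """Return one URL per query term, with the term safely encoded."""
    return [f"{page_url}?{urlencode({parameter: term})}" for term in query_terms]


urls = build_query_urls(
    "http://www.example.com/cgi-bin/search",
    ["laser printers", "toner"],
)
for url in urls:
    print(url)
```

Encoding each term keeps spaces and reserved characters from corrupting the request
presented to the page.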

A passwords component 56 indicates passwords for the collection tool 6 to use at
a particular URL to access data at that URL. The password(s) would allow the collection
tool 6 to proceed to a protected site or search for pages behind a firewall. The recursive
search settings component 58 indicates how the collection tool 6 is to search URL links at
the accessed URL listed in the search URLs 50. For instance, the setting 58 may instruct
the collection tool 6 to recursively search all links at a URL. Alternatively, the setting 58
may instruct not to search any links beyond the URL, or limit the depth or extent of
recursive searching.

The prohibited URLs component 60 indicates URLs that the collection tool 6 is
prohibited from accessing. This component 60 is used to prevent the collection tool 6
from accessing a particular URL that would be recursively accessed through a link at a
URL listed in the search URLs 50. The validation checkers component 62 indicates one
or more validation checkers 24a, b, ... n that the collection tool 6 should execute against
an accessed page. The check parameters component 64 indicates specific characteristics
the validation checker 24a, b, ... n should check for on the examined Web page. For
instance, the parameters 64 may indicate seals of approval, endorsements, warnings,
ratings, etc., that the validation checker 24a, b, ... n should search upon. If no check
parameters are provided for a validation checker 24a, b, ... n, then the validation checker
24a, b, ... n does not require any user parameters to perform its checking, such as the
case with BOBBY or XML well-formedness checkers that determine whether a page
complies with certain predetermined parameters.

Validation checker qualifiers 66 provide qualifiers to apply against the output
from a validation checker 24a, b, ... n checking the content at the URL. If the output of
the validation checker 24a, b, ... n for the URL does not satisfy the qualifier predicate,
then no metadata would be returned for that URL and that non-qualifying URL would not
be indexed for inclusion with the metadata provided to the warehouse 4. For instance,
the qualifier may indicate to not return search results of pages that are not BOBBY
compliant. If the BOBBY validation checker 24a, b, ... n determined that the content at
the URL was not BOBBY compliant, then according to the BOBBY qualifier, metadata
for that URL would not be returned and that URL would not be indexed. Alternatively,
metadata can be returned that indicates the non-compliance of the checked URL page, or
some rating indicating a non-compliance. Such metadata indicating non-compliance
could allow Internet searchers who are interested to locate non-compliant pages. There
may be a separate validation checker qualifier 66 for each validation checker 62 selected
in the search instruction file 20. If there is no validation checker qualifier 66, then
metadata for that page is returned regardless of the output from the validation checker
24a, b, ... n. If a qualifier for the output from a validation checker 24a, b, ... n is not
satisfied, then the collection tool 6 may not provide any data for that non-qualifying
URL. Alternatively, the collection tool 6 may not include the non-qualifying output with
the metadata for the URL, but allow other qualifying metadata for the URL to be
provided to the warehouse 4. In this way, the content provider 2 can control the type of
metadata provided to tailor how the URL will be returned in response to Internet user
search queries.
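
The alternative treatments of a non-qualifying validation output described above may be
sketched as a single function. The name apply_checker_output and the three on_fail modes
("exclude_page", "omit_output", "flag") are labels invented for this sketch; the
qualifier is modeled as an ordinary predicate function.

```python
def apply_checker_output(metadata, checker_name, output, qualifier=None,
                         on_fail="exclude_page"):
    """Merge one validation checker output into a page's metadata.

    Returns the updated metadata, or None when the page is to be left
    out of the index entirely.
    """
    if qualifier is None or qualifier(output):
        # No qualifier, or qualifier satisfied: include the output.
        return {**metadata, checker_name: output}
    if on_fail == "exclude_page":
        return None  # no metadata at all is returned for this URL
    if on_fail == "flag":
        return {**metadata, checker_name: "non-compliant"}
    # "omit_output": keep other metadata but drop this checker's output
    return dict(metadata)


base = {"url": "http://www.example.com/"}
is_compliant = lambda output: output == "compliant"

included = apply_checker_output(base, "accessibility", "compliant", is_compliant)
excluded = apply_checker_output(base, "accessibility", "errors", is_compliant)
flagged = apply_checker_output(base, "accessibility", "errors", is_compliant,
                               on_fail="flag")
```

Here the content provider's choice of mode decides whether a non-compliant page
disappears from the index, appears without the checker output, or appears marked as
non-compliant.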

A page attribute qualifier component 68 indicates attributes to use to qualify a
page, such as the date of the content, byte size, etc.
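
The components of the search instruction file 20 described above could be held in memory
in a structure along the following lines. This is a sketch only: the class name, the
field names, the field types, and the example values are all assumptions, and the
on-disk format of the search instruction file 20 is not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class SearchInstructionFile:
    """Hypothetical in-memory form of the search instruction file 20."""

    search_urls: List[str] = field(default_factory=list)                  # component 50
    query_terms: Dict[str, List[str]] = field(default_factory=dict)       # component 52
    query_term_qualifiers: Dict[str, str] = field(default_factory=dict)   # component 54
    passwords: Dict[str, str] = field(default_factory=dict)               # component 56
    max_link_depth: Optional[int] = None        # component 58; None means unrestricted
    prohibited_urls: Set[str] = field(default_factory=set)                # component 60
    validation_checkers: List[str] = field(default_factory=list)          # component 62
    check_parameters: Dict[str, List[str]] = field(default_factory=dict)  # component 64
    validation_checker_qualifiers: Dict[str, str] = field(default_factory=dict)  # 66
    page_attribute_qualifiers: Dict[str, str] = field(default_factory=dict)      # 68


# Example settings a content provider might choose.
instructions = SearchInstructionFile(
    search_urls=["http://www.example.com/catalog"],
    query_terms={"http://www.example.com/catalog": ["printers", "toner"]},
    max_link_depth=2,
    prohibited_urls={"http://www.example.com/catalog/internal"},
    validation_checkers=["xml-well-formedness"],
)
```

Components left unset fall back to empty defaults, which corresponds to the case where
the content provider supplies no passwords, qualifiers, or check parameters.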

The warehouse 4 may develop a graphical user interface (GUI) to allow the
content provider 2 to set search settings in the search instruction file 20. For instance, if
the content provider 2 communicates with the warehouse 4 over the Internet, then the
content provider 2 may use a Web browser GUI to select search settings in the search
instruction file 20 that will control how the collection tool 6 searches the repository 8.
The GUI could also allow the content provider 2 to select validation checkers 24a, b, ... n
to use and then select qualifiers 66 for the validation checkers 24a, b, ... n. This allows the
content provider 2 to customize how the collection tool 6 will search its repository 8.

FIG. 4 illustrates program logic implemented in the collection tool 6 to perform
the search operation on one or more URL pages the content provider 2 submits to the
warehouse 4. Control begins at block 100 with the collection tool being executed. The
content provider 2 may run the collection tool 6 internally within a firewall including the
repository 8. Alternatively, the warehouse 4 may run the collection tool 6 from a server
external to the firewall including the repository 8. The content provider 2 and
warehouse 4 may agree upon a regular schedule for running the collection tool 6 to ensure
that the warehouse URL database is regularly updated with indexable metadata for the
repository 8 URLs. Further, the content provider 2 may determine when to run the
collection tool 6, such as after any updates or modifications to the content in the
repository 8.

Once executed, the collection tool 6 accesses the search instruction file 20, which
may be maintained in a directory at the content provider 2 computer running the
collection tool 6, the repository 8, or the warehouse 4 server. Between block 104 and
160, the collection tool 6 performs the steps at block 106-148 for each URL page in the
search URL list 50. If (at block 106) there is a password provided for the URL page
being considered, then the collection tool uses (at block 108) that password when
accessing the URL page; otherwise, the collection tool 6 accesses (at block 110) the URL
page. If (at block 112) there are query terms provided for that URL page in the query
terms list 52, then the collection tool 6 submits (at block 114) each provided query term
to the URL page to obtain dynamic data for that search term. After receiving (at block
116) the query results, the collection tool 6 determines (at block 118) whether there are
any qualifiers in the query term qualifiers components 54 for all the query terms or
particular query terms. Thus, certain qualifiers may apply to the query results for all
query terms or the query results for specific query terms. If there are such qualifiers, then
the collection tool 6 performs (at block 120) the non-qualifying action with respect to
those query results that do not satisfy the qualifiers and a qualifying action for those query
results that satisfy the qualifiers. The non-qualifying action may comprise not including
the non-qualifying query results in the metadata for the URL page and/or including
information that the query results did not qualify. The qualifying action may comprise
including the qualifying query results in the metadata for the URL page and/or including
information that the query results qualified. If there are no qualifiers for the search
terms, then the collection tool 6 would append (at block 122) information on the query
results to the metadata for the URL page.
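
The query-term portion of this logic (blocks 112 through 122) can be sketched as a short
loop. The function collect_query_metadata, the "*" key used for a qualifier applying to
all query terms, and the FAKE_DATABASE stand-in for the repository's dynamic data are
hypothetical. The non-qualifying action shown is the simplest one described, namely
leaving non-qualifying results out of the metadata.

```python
def collect_query_metadata(query_terms, run_query, qualifiers=None):
    """Submit each query term and keep the results allowed by the qualifiers."""
    qualifiers = qualifiers or {}
    metadata = {}
    for term in query_terms:
        results = run_query(term)  # submit the term and receive the results
        # A qualifier may be given for this term or, under "*", for all terms.
        predicate = qualifiers.get(term, qualifiers.get("*"))
        if predicate is None:
            metadata[term] = list(results)  # no qualifier: append everything
        else:
            metadata[term] = [r for r in results if predicate(r)]
    return metadata


# Stand-in for dynamic data served by a program at the repository.
FAKE_DATABASE = {
    "printer": ["Laser printer X100", "Inkjet printer"],
    "toner": ["Toner T1"],
}

page_metadata = collect_query_metadata(
    ["printer", "toner"],
    run_query=lambda term: FAKE_DATABASE.get(term, []),
    qualifiers={"printer": lambda result: "Laser" in result},
)
```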

From block 120 or 122, control proceeds to block 124 in FIG. 5. At block 124, the
collection tool 6 determines whether the validation checkers component 62
indicates that validation checkers 24a, b, ... n are selected to run. If so, then the collection
tool 6 runs (at block 126) each selected validation checker 24a, b, ... n against the URL
page, using any check parameters 64 provided for the validation checker 24a, b, ... n. If (at
block 128) there are any validation checker qualifiers 66 for a particular validation
checker 24a, b, ... n, then for those validation checkers that have qualifiers, the collection
tool 6 determines (at block 130) whether the validation checker output satisfies the
qualifier(s). If the qualifiers are satisfied (from the yes branch of 130) or if there are no
validation checker qualifiers (the no branch of block 128), then the collection tool 6
appends (at block 132) information on the validation checker output in the metadata for
the URL page. If the validation checker output does not satisfy the qualifiers, then the
collection tool 6 performs (at block 140) the non-qualifying action, which may comprise
not including any information on the search URL page in the metadata and proceeding to
block 160 to consider the next URL page in the search URLs 50. Alternatively, the
non-qualifying action may involve the collection tool 6 including specific information in
the metadata for the URL page that the qualifier was not satisfied and then proceeding to
block 134 to gather further metadata for the URL page as instructed in the search
instruction file 20.

After considering any selected validation checkers in the validation checkers
component 62 and if the validation checker output satisfied any qualifiers, then control
proceeds from blocks 124 or 132 to block 134. If (at block 134) there are any page
attribute qualifiers 68, then the collection tool 6 determines (at block 136) whether the
URL page satisfies the attribute qualifiers. If so, then the collection tool appends (at
block 138) information on the satisfied attributes in the metadata for the URL page.
Otherwise, if the page attributes are not satisfied, then the collection tool 6 performs (at
block 140) the non-qualifying action, which may comprise not including any information
on the search URL page in the metadata and proceeding to block 160 to consider the next
URL page in the search URLs 50. Alternatively, the non-qualifying action may involve
the collection tool 6 including specific information in the metadata for the URL page that
the qualifier was not satisfied and then proceeding to block 142 to gather further metadata
for the URL page as instructed in the search instruction file 20.

From the no branch of block 134 or block 138, control proceeds to block 142
where the collection tool 6 determines the format of the URL page, e.g., XML, HTML,
DHTML, and selects (at block 144) a parser 22a, b, ... n capable of parsing the URL page
format. The collection tool 6 then uses the selected parser 22a, b, ... n, i.e., calls the parser,
to parse (at block 146) the content to generate metadata for the URL page. The parser
22a, b, ... n may generate indexable metadata from the URL page in a manner known in the
art. For instance, the parser 22a, b, ... n may index the full body of visible text, but may
exclude commonly used words, e.g., "the", "and", etc., may index keywords included in a
special keyword metatag in the document, and/or may perform word stemming to include
variations of a word, e.g., politics, politician, political, etc., in the indexable metadata.
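
A simplified sketch of such an HTML parser follows. It indexes visible text, drops a
small set of commonly used words, and picks up the keyword metatag; word stemming is
left out for brevity. The class IndexingParser, the function index_terms, and the short
STOP_WORDS list are assumptions made for this sketch.

```python
import re
from html.parser import HTMLParser

# Small illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"the", "and", "a", "of", "to", "in"}


class IndexingParser(HTMLParser):
    """Gathers index terms from visible text and the keywords metatag."""

    def __init__(self):
        super().__init__()
        self.terms = set()
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            for keyword in (attributes.get("content") or "").split(","):
                if keyword.strip():
                    self.terms.add(keyword.strip().lower())

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # script and style bodies are not visible text
        for word in re.findall(r"[a-z]+", data.lower()):
            if word not in STOP_WORDS:
                self.terms.add(word)


def index_terms(html: str) -> list:
    parser = IndexingParser()
    parser.feed(html)
    return sorted(parser.terms)


PAGE = (
    '<html><head><meta name="keywords" content="Politics, Elections">'
    "<style>p {color: red}</style></head>"
    "<body><p>The senate and the vote</p></body></html>"
)
print(index_terms(PAGE))
```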

If (at block 148) the URL page includes a reference to an embedded file, such as
embedded images, Shockwave files, ZIP files, or any other encoded file, then for each
reference to an embedded file, the collection tool 6 selects (at block 150) a parser 22a,
b, ... n capable of parsing the embedded file. An embedded file is distinguished from a file
referenced through a hypertext link, such as a hypertext link created using the HTML
"HREF" statement. A linked file is an HTML page the browser accesses. An embedded
file is typically referenced as an Applet or using an object tag that specifies a plug-in
application to use to open and render the embedded file, or make the content of the
embedded file available for processing. The collection tool 6 uses (at block 152) the
selected parser to parse each embedded file and generate indexable metadata from the
embedded file content to append to the metadata for the URL page. Either the parser or
collection tool 6 would use the plug-in application indicated in the object tag for the
embedded file to process or render the content of the embedded file. From the no branch
of block 148 or block 152, the collection tool determines (at block 154) whether the URL
page includes links to other URL pages. If links are included and the recursive search
settings 58 do not restrict (at block 156) recursive searching, then the collection tool 6
performs (at block 158) steps 106 to 158 for each linked URL page that is not listed as a
prohibited URL 60. From the no branch of block 154, the yes branch of block 156 or
from block 158, control returns (at block 160) to consider the next URL page listed in the
search URLs 50.
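
The control flow of blocks 154 to 160 can be sketched as follows. This is a simplified
illustration and not the logic of the figures: the function collect_pages, its parameters,
and the get_links callback are hypothetical names, page parsing is reduced to recording the
URL, and the guard against revisiting a page is an added assumption that the text does not
state.

```python
from typing import Callable, Iterable, List, Set


def collect_pages(
    search_urls: Iterable[str],
    get_links: Callable[[str], List[str]],
    recursive: bool,
    prohibited: Set[str],
) -> List[str]:
    """Return the URLs in the order in which they would be processed."""
    processed: List[str] = []
    seen: Set[str] = set()

    def visit(url: str) -> None:
        if url in seen:
            return  # assumption: avoid loops when pages link to each other
        seen.add(url)
        processed.append(url)  # stand-in for parsing and metadata generation
        links = get_links(url)  # block 154: does the page include links?
        if not links or not recursive:  # block 156: is recursion restricted?
            return
        for linked in links:  # block 158: repeat for each linked page
            if linked not in prohibited:
                visit(linked)

    for url in search_urls:  # block 160: consider the next listed search URL
        visit(url)
    return processed
```

For a site in which page /a links to /b and /secret, and /b links back to /a and on to /c,
calling collect_pages(["/a"], ..., recursive=True, prohibited={"/secret"}) processes /a, /b
and /c only. A very deep site would call for an explicit work queue rather than the
recursion used in this sketch.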

Preferred embodiments provide a URL metadata collection tool 6 for searching
URLs at a content provider's site that is an improvement over current search robots and
agents. With the preferred embodiments, the content provider 2 can control when
searches of its repository 8 will occur to ensure that indexable metadata for the URLs in
the repository 8 is current. Further, by allowing the content provider 2 to control the
scheduling of searches, the content provider 2 can avoid the situation in the current art
where search agents or robots may overload the server of the content provider 2. For
instance, the content provider 2 can schedule the collection tool 6 searches in off-peak hours.

Moreover, with preferred embodiments, the content provider 2 can specify
particular URLs for the collection tool 6 to search and gather indexable metadata. Still
further, the collection tool 6 is capable of gathering indexable metadata for multi-media
content and content encodings that are currently not indexed by search engines, such as
Shockwave files, ZIP files, and other non-text multimedia files. Moreover, with the
preferred embodiments, the content provider 2 can control whether information is
indexed by validation checking pages and setting qualifiers that may exclude metadata
from pages that do not satisfy certain criteria. In this way, preferred embodiments
provide an improved and more controllable tool for gathering indexable metadata on
URL pages at a content provider 2 web site and allowing the content provider 2 to control
how indexable metadata is gathered.
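
The effect of validation checking and qualifiers can be illustrated with a short sketch. The
names ValidationCheck and gather_metadata, and the word-count check used in the usage note,
are hypothetical and are not part of the described embodiments; the sketch only shows a
page being excluded from the index when a validation output result fails its qualifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ValidationCheck:
    name: str
    # Produces a validation output result describing the page content.
    run: Callable[[str], int]
    # Optional test on that result; None means the result is always accepted.
    qualifier: Optional[Callable[[int], bool]] = None


def gather_metadata(
    url: str, content: str, checks: List[ValidationCheck]
) -> Optional[Dict[str, object]]:
    """Return metadata for the page, or None if any qualifier is not satisfied."""
    metadata: Dict[str, object] = {"url": url}
    for check in checks:
        result = check.run(content)
        if check.qualifier is not None and not check.qualifier(result):
            return None  # page excluded: no metadata is added to the index
        metadata[check.name] = result
    return metadata
```

For example, a check named "word_count" that counts the words on a page, paired with a
qualifier requiring at least three words, would yield metadata for a page reading "one two
three four" and would exclude a page reading "one two". Other non-qualifying actions are
possible; exclusion is simply the one shown here.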

Once the warehouse 4 gathers the metadata 10 from different content providers 2,
then the warehouse 4 can provide the indexable metadata to search engine companies to
include in their search engine databases or to other interested parties. Moreover, a data
warehouse may use the collection tool as part of a business model to enroll subscribers
who want to control how metadata is gathered about their URLs. The data warehouse
can use the collection tool to develop a database of metadata for URLs for commercial
exploitation. Further, the data warehouse could have advertisements displayed when the
collection tool program is executed or configured to generate revenue when content
providers use the collection tool.

Alternative Embodiments and Conclusions
The following describes some alternative embodiments for accomplishing the 
present invention. 

The preferred embodiments may be implemented as a method, apparatus or
program using standard programming and/or engineering techniques to produce software,
firmware, hardware, or any combination thereof. The programs defining the functions of
the preferred embodiment can be delivered to a computer via a variety of information
bearing media, which include, but are not limited to, computer-readable devices, carriers,
or media, such as a magnetic storage media, "floppy disk," CD-ROM, a file server
providing access to the programs via a network transmission line, wireless transmission
media, signals propagating through space, radio waves, infrared signals, etc. Of course,
those skilled in the art will recognize that many modifications may be made to this
configuration without departing from the scope of the present invention. Such
information bearing media, when carrying computer-readable instructions that direct the
functions of the present invention, represent alternative embodiments of the present
invention.

Preferred embodiments provide specific program components included in the
collection tool to use to provide additional metadata indexing capabilities. In further
embodiments, the collection tool may examine pages for criteria other than those
described herein.

Preferred embodiments described particular settings that the content provider may
configure in the search instruction file 20. In further embodiments, the content provider 2
may configure different types of settings than those described herein to provide additional
levels of control over how the collection tool 6 searches Web pages and the metadata
returned.
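
As a purely illustrative sketch, settings of the kind described in this specification could
be held in a structure such as the following. The JSON encoding, the field names, and the
names SearchInstructions and load_instructions are assumptions; the specification does not
prescribe a file format for the search instruction file 20.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SearchInstructions:
    search_urls: List[str]                                        # URLs to search
    recursive: bool = False                                       # recursive search setting
    prohibited_urls: List[str] = field(default_factory=list)      # never followed
    passwords: Dict[str, str] = field(default_factory=dict)       # URL -> password
    query_terms: Dict[str, List[str]] = field(default_factory=dict)  # URL -> terms


def load_instructions(text: str) -> SearchInstructions:
    """Parse settings from JSON text (the JSON layout is an assumption)."""
    raw = json.loads(text)
    if not raw.get("search_urls"):
        raise ValueError("at least one search URL is required")
    return SearchInstructions(
        search_urls=list(raw["search_urls"]),
        recursive=bool(raw.get("recursive", False)),
        prohibited_urls=list(raw.get("prohibited_urls", [])),
        passwords=dict(raw.get("passwords", {})),
        query_terms={k: list(v) for k, v in raw.get("query_terms", {}).items()},
    )
```

Additional types of settings, as contemplated above, would appear as further fields with
defaults, so that instruction files written for an earlier version of the tool remain
readable.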

The preferred logic of FIGs. 4-6 describes specific operations occurring in a
particular order. In alternative embodiments, certain of the logic operations may be
performed in a different order, modified or removed and still implement preferred
embodiments of the present invention. Moreover, steps may be added to the above
described logic and still conform to the preferred embodiments. Further, operations
described herein may occur sequentially or certain operations may be processed in
parallel.

In summary, the preferred embodiments provide a method, system, and program
for searching a web site managed by a content provider to gather indexable metadata on
content at addressable locations at the web site. Settings capable of being customized by
the content provider are accessed. The customized settings provide instructions on how
to search the content provider's web site. The accessed customized settings are used to
control the processing of content pages at the content provider's web site. Metadata from
accessed content pages is generated and added to an index of metadata for accessed
addressable locations at the web site.

The foregoing description of the preferred embodiments of the invention has been
presented for the purposes of illustration and description. It is not intended to be
exhaustive or to limit the invention to the precise form disclosed. Many modifications
and variations are possible in light of the above teaching. It is intended that the scope of
the invention be limited not by this detailed description, but rather by the claims
appended hereto. The above specification, examples and data provide a complete
description of the manufacture and use of the composition of the invention. Since many
embodiments of the invention can be made without departing from the spirit and scope of
the invention, the invention resides in the claims hereinafter appended.

WHAT IS CLAIMED

1. A method for searching a data repository managed by a content provider
to gather indexable metadata on content at addressable locations at the data repository,
comprising:

accessing settings capable of being customized by the content provider, wherein
the customized settings provide instructions on how to search the content provider's data
repository;

accessing content pages at the content provider's data repository;

accessing the content of content pages at the content provider's data repository in
accordance with instructions included in the accessed customized settings; and

generating metadata from accessed content pages to add to an index of metadata
for accessed addressable locations at the data repository.

2. The method of claim 1, wherein the customized settings include
parameters and access methods unique to an arrangement of content in the content
provider's data repository.

3. The method of claim 1, wherein the accessed customizable settings
provide addressable locations at the content provider's data repository provided by the
content provider, wherein accessing the content pages includes accessing the content
pages at the provided addressable locations, wherein metadata is generated for the
accessed content pages.

4. The method of claim 3, wherein the addressable locations comprise
uniform resource locator (URL) addresses.

5. The method of claim 3, wherein the accessed customizable settings
provide query terms for at least one included addressable location, further comprising:

for each provided addressable location for which there are query terms, using the
provided query terms at the provided addressable location to obtain query results; and

generating metadata from the obtained query results to add to the index of
metadata for accessed addressable locations at the data repository.



6. The method of claim 5, wherein the accessed customizable settings further
provide qualifiers for at least one search term, further comprising:

for each query term having at least one qualifier, determining whether the query
results for the query term satisfy each qualifier for the query term, wherein the metadata
for the query result is generated if the query result satisfies each qualifier for the query
term that generated the query result; and

performing a non-qualifying action for each query result that does not satisfy each
qualifier.

7. The method of claim 6, wherein the non-qualifying action comprises not
including metadata for the query result in the index.



8. The method of claim 3, wherein the accessed customizable settings further
provide a password for at least one provided addressable location, further comprising:

using the provided password to access the content page at the indicated
addressable location for which the password is provided.

9. The method of claim 1, wherein the accessed customizable settings further
indicate a recursive search setting indicating whether to search hypertext links to linked
addressable locations included in the accessed content page, further comprising:

accessing a content page at each linked addressable location included if the
recursive search setting indicates to recursively search linked addressable locations,
wherein metadata is generated for each content page recursively accessed at the linked
addressable locations in the accessed content page.

10. The method of claim 9, wherein the accessed customizable settings further
provide prohibited addressable locations at the data repository, wherein metadata is not
generated for each content page at a linked addressable location that is one indicated
prohibited address location.

11. The method of claim 1, wherein the accessed customizable settings further
indicate validation checking programs, further comprising:

executing each validation checking program indicated in the accessed
customizable settings against each accessed content page;

generating a validation output result with the validation checking program for
each accessed content page with each validation checking program describing
characteristics of the content page;

generating metadata from the validation output result to add to the index of
metadata for accessed addressable locations at the data repository.

12. The method of claim 11, wherein the accessed customizable settings
further indicate at least one parameter to use with at least one validation checking
program, further comprising:

using the at least one parameter when executing the validation checking program,
wherein the validation output result further indicates characteristics of the content page
related to the at least one parameter used with the validation checking program.

13. The method of claim 11, wherein the accessed customizable settings
further indicate at least one qualifier to use with at least one validation checking program,
further comprising:

determining whether the validation output result satisfies the at least one qualifier
provided with the validation checking program producing the output result, wherein
metadata for the output result is included in the index if the output result satisfies the
qualifier.

14. The method of claim 13, wherein metadata for the content page at the
addressable location is not included in the index if the validation output result does not
satisfy the qualifier.

15. The method of claim 1, further comprising:

determining a format of the accessed content page;

selecting one of a plurality of parsers capable of parsing the determined format; and

parsing the content page using the selected parser, wherein the metadata to add to
the index is generated from the parsed content page.

16. The method of claim 1, further comprising:

determining a parser capable of parsing an embedded file referenced in the content
page;

parsing the content of the referenced embedded file; and

generating metadata for the parsed content of the embedded file to add to the
index.

17. The method of claim 16, wherein the embedded file is encoded in a
multimedia format.

18. The method of claim 1, further comprising:

distributing a collection tool to content providers capable of accessing and
generating metadata for content provider data repositories using the accessed
customizable settings; and

collecting metadata gathered from multiple content providers using the
collection tool to gather metadata on their data repositories.

19. The method of claim 18, further comprising commercializing the collected
metadata.

20. The method of claim 18, further comprising:

receiving an electronic subscription from content providers to use the collection
tool and provide metadata.

21. A system for searching a data repository managed by a content provider to
gather indexable metadata on content at addressable locations at the data repository,
comprising:

means for accessing settings capable of being customized by the content provider,
wherein the customized settings provide instructions on how to search the content
provider's data repository;

means for accessing content pages at the content provider's data repository;

means for accessing the content of content pages at the content provider's data
repository in accordance with instructions included in the accessed customized settings;
and

means for generating metadata from accessed content pages to add to an index of
metadata for accessed addressable locations at the data repository.

22. The system of claim 21, wherein the customized settings include
parameters and access methods unique to an arrangement of content in the content
provider's data repository.

23. The system of claim 22, wherein the accessed customizable settings
provide addressable locations at the content provider's data repository provided by the
content provider, wherein the means for accessing the content pages includes accessing
the content pages at the provided addressable locations, wherein metadata is generated for
the accessed content pages.

24. The system of claim 23, wherein the addressable locations comprise
uniform resource locator (URL) addresses.

25. The system of claim 23, wherein the accessed customizable settings
provide query terms for at least one included addressable location, further comprising:

means for using the provided query terms at the provided addressable location to
obtain query results for each provided addressable location for which there are query
terms; and

means for generating metadata from the obtained query results to add to the index
of metadata for accessed addressable locations at the data repository.

26. The system of claim 25, wherein the accessed customizable settings
further provide qualifiers for at least one search term, further comprising:

means for determining whether the query results for the query term satisfy each
qualifier for the query term for each query term having at least one qualifier, wherein the
metadata for the query result is generated if the query result satisfies each qualifier for the
query term that generated the query result; and

means for performing a non-qualifying action for each query result that does not
satisfy each qualifier.

27. The system of claim 26, wherein the non-qualifying action comprises not
including metadata for the query result in the index.

28. The system of claim 22, wherein the accessed customizable settings
further provide a password for at least one provided addressable location, further
comprising:

means for using the provided password to access the content page at the indicated
addressable location for which the password is provided.

29. The system of claim 23, wherein the accessed customizable settings
further indicate a recursive search setting indicating whether to search hypertext links to
linked addressable locations included in the accessed content page, further comprising:

means for accessing a content page at each linked addressable location included if
the recursive search setting indicates to recursively search linked addressable locations,
wherein metadata is generated for each content page recursively accessed at the linked
addressable locations in the accessed content page.

30. The system of claim 29, wherein the accessed customizable settings
further provide prohibited addressable locations at the data repository, wherein metadata
is not generated for each content page at a linked addressable location that is one
indicated prohibited address location.

31. The system of claim 21, wherein the accessed customizable settings
further indicate validation checking programs, further comprising:

means for executing each validation checking program indicated in the accessed
customizable settings against each accessed content page;

means for generating a validation output result with the validation checking
program for each accessed content page with each validation checking program
describing characteristics of the content page; and

means for generating metadata from the validation output result to add to the
index of metadata for accessed addressable locations at the data repository.

32. The system of claim 31, wherein the accessed customizable settings
further indicate at least one parameter to use with at least one validation checking
program, further comprising:

means for using the at least one parameter when executing the validation checking
program, wherein the validation output result further indicates characteristics of the
content page related to the at least one parameter used with the validation checking
program.

33. The system of claim 31, wherein the accessed customizable settings
further indicate at least one qualifier to use with at least one validation checking program,
further comprising:

means for determining whether the validation output result satisfies the at least
one qualifier provided with the validation checking program producing the output result,
wherein metadata for the output result is included in the index if the output result satisfies
the qualifier.

34. The system of claim 33, wherein metadata for the content page at the
addressable location is not included in the index if the validation output result does not
satisfy the qualifier.

35. The system of claim 21, further comprising:

means for determining a format of the accessed content page;

means for selecting one of a plurality of parsers capable of parsing the determined
format; and

means for parsing the content page using the selected parser, wherein the metadata
to add to the index is generated from the parsed content page.

36. The system of claim 21, further comprising:

means for determining a parser capable of parsing an embedded file referenced in
the content page;

means for parsing the content of the referenced embedded file; and

means for generating metadata for the parsed content of the embedded file to add
to the index.

37. The system of claim 36, wherein the embedded file is encoded in a
multimedia format.

38. The system of claim 22, further comprising:

means for distributing a collection tool to content providers capable of accessing
and generating metadata for content provider data repositories using the accessed
customizable settings; and

means for collecting metadata gathered from multiple content providers using
the collection tool to gather metadata on their data repositories.

39. A program for searching a data repository managed by a content provider
to gather indexable metadata on content at addressable locations at the data repository,
wherein the program comprises code implemented in a computer readable medium
capable of causing a computer to perform:

accessing settings capable of being customized by the content provider, wherein
the customized settings provide instructions on how to search the content provider's data
repository;

accessing content pages at the content provider's data repository;

accessing the content of content pages at the content provider's data repository in
accordance with instructions included in the accessed customized settings; and

generating metadata from accessed content pages to add to an index of metadata
for accessed addressable locations at the data repository.

40. The program of claim 39, wherein the customized settings include
parameters and access methods unique to an arrangement of content in the content
provider's data repository.

41. The program of claim 39, wherein the accessed customizable settings
provide addressable locations at the content provider's data repository provided by the
content provider, wherein accessing the content pages includes accessing the content
pages at the provided addressable locations, wherein metadata is generated for the
accessed content pages.

42. The program of claim 41, wherein the addressable locations comprise
uniform resource locator (URL) addresses.

43. The program of claim 39, wherein the accessed customizable settings
provide query terms for at least one included addressable location, wherein the program is
further capable of causing the computer to perform:

for each provided addressable location for which there are query terms, using the
provided query terms at the provided addressable location to obtain query results; and

generating metadata from the obtained query results to add to the index of
metadata for accessed addressable locations at the data repository.

44. The program of claim 43, wherein the accessed customizable settings
further provide qualifiers for at least one search term, wherein the program is further
capable of causing the computer to perform:

for each query term having at least one qualifier, determining whether the query
results for the query term satisfy each qualifier for the query term, wherein the metadata
for the query result is generated if the query result satisfies each qualifier for the query
term that generated the query result; and

performing a non-qualifying action for each query result that does not satisfy each
qualifier.

45. The program of claim 44, wherein the non-qualifying action comprises not
including metadata for the query result in the index.

46. The program of claim 39, wherein the accessed customizable settings
further provide a password for at least one provided addressable location, wherein the
program is further capable of causing the computer to perform:

using the provided password to access the content page at the indicated
addressable location for which the password is provided.

47. The program of claim 39, wherein the accessed customizable settings
further indicate a recursive search setting indicating whether to search hypertext links to
linked addressable locations included in the accessed content page, wherein the program
is further capable of causing the computer to perform:

accessing a content page at each linked addressable location included if the
recursive search setting indicates to recursively search linked addressable locations,
wherein metadata is generated for each content page recursively accessed at the linked
addressable locations in the accessed content page.

48. The program of claim 47, wherein the accessed customizable settings
further provide prohibited addressable locations at the data repository, wherein metadata
is not generated for each content page at a linked addressable location that is one
indicated prohibited address location.

49. The program of claim 39, wherein the accessed customizable settings
further indicate validation checking programs, wherein the program is further capable of
causing the computer to perform:

executing each validation checking program indicated in the accessed
customizable settings against each accessed content page;

generating a validation output result with the validation checking program for
each accessed content page with each validation checking program describing
characteristics of the content page;

generating metadata from the validation output result to add to the index of
metadata for accessed addressable locations at the data repository.

50. The program of claim 49, wherein the accessed customizable settings
further indicate at least one parameter to use with at least one validation checking
program, wherein the program is further capable of causing the computer to perform:

using the at least one parameter when executing the validation checking program,
wherein the validation output result further indicates characteristics of the content page
related to the at least one parameter used with the validation checking program.

51. The program of claim 49, wherein the accessed customizable settings
further indicate at least one qualifier to use with at least one validation checking program,
wherein the program is further capable of causing the computer to perform:

determining whether the validation output result satisfies the at least one qualifier
provided with the validation checking program producing the output result, wherein
metadata for the output result is included in the index if the output result satisfies the
qualifier.

52. The program of claim 51, wherein metadata for the content page at the
addressable location is not included in the index if the validation output result does not
satisfy the qualifier.

53. The program of claim 39, wherein the program is further capable of
causing the computer to perform:

determining a format of the accessed content page;

selecting one of a plurality of parsers capable of parsing the determined format; and

parsing the content page using the selected parser, wherein the metadata to add to
the index is generated from the parsed content page.

54. The program of claim 39, wherein the program is further capable of
causing the computer to perform:

determining a parser capable of parsing an embedded file referenced in the content
page;

parsing the content of the referenced embedded file; and

generating metadata for the parsed content of the embedded file to add to the
index.

55. The program of claim 54, wherein the embedded file is encoded in a
multimedia format.

56. The program of claim 39, further comprising:

distributing the program to content providers capable of accessing and generating
metadata for content provider data repositories using the accessed customizable settings;
and

collecting metadata gathered from multiple content providers using the
collection tool to gather metadata on their data repositories.

57. The program of claim 56, further comprising:

receiving an electronic subscription from content providers to use the program to
gather and provide metadata.

METHOD, SYSTEM, AND PROGRAM FOR GATHERING
INDEXABLE METADATA ON CONTENT AT A DATA REPOSITORY

ABSTRACT

Disclosed is a method, system, and program for searching a data repository
managed by a content provider to gather indexable metadata on content at addressable
locations at the data repository. Settings capable of being customized by the content
provider are accessed. The customized settings provide instructions on how to search the
content provider's data repository. The content of content pages at the content provider's
data repository is accessed in accordance with instructions included in the accessed
customized settings. Metadata from accessed content pages is generated and added to an
index of metadata for accessed addressable locations at the data repository.

Los Angeles, CA 90035

Direct all telephone calls to David Victor at (310) 553-7977

DECLARATION AND POWER OF ATTORNEY FOR
PATENT APPLICATION

As a below named inventor, I hereby declare that:

My residence, post office address and citizenship are as stated below next to my name;

I believe I am the original, first and sole inventor (if only one name is listed below) or an
original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed
and for which a patent is sought on the invention entitled

METHOD, SYSTEM, AND PROGRAM FOR GATHERING INDEXABLE METADATA ON
CONTENT AT A DATA REPOSITORY

the specification of which (check one)

X is attached hereto.

was filed on

as Application Serial No.

and was amended on

(if applicable)

I hereby state that I have reviewed and understand the contents of the above identified specification,
including the claims, as amended by any amendment referred to above.

I acknowledge the duty to disclose information which is material to the patentability of this application in
accordance with Title 37, Code of Federal Regulations, 1.56.

I hereby claim foreign priority benefits under Title 35, United States Code, 119 of any foreign
application(s) for patent or inventor's certificate listed below and have also identified below any foreign
application for patent or inventor's certificate having a filing date before that of the application on which
priority is claimed:

Prior Foreign Application(s): Priority Claimed

Yes No

(Number) (Country) (Day/Month/Year)

I hereby claim the benefit under Title 35, United States Code, 120 of any United States application(s)
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in
the prior United States application in the manner provided by the first paragraph of Title 35, United
States Code, 112, I acknowledge the duty to disclose information material to the patentability of this
application as defined in Title 37, Code of Federal Regulations, 1.56 which occurred between the filing
date of the prior application and the national or PCT international filing date of this application:

(Application Serial #) (Filing Date) (Status)

I hereby declare that all statements made herein of my own knowledge are true and that all statements
made on information and belief are believed to be true; and further that these statements were made with
the knowledge that willful false statements and the like so made are punishable by fine or imprisonment,
or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements
may jeopardize the validity of the application or any patent issued thereon.

POWER OF ATTORNEY: As a named inventor, I hereby appoint the following attorneys and/or agents
to prosecute this application and transact all business in the Patent and Trademark Office connected
therewith.

John W. Henderson, Jr., Reg. No. 26,907; James H. Barksdale, Jr., Reg. No. 24,091; Thomas E. Tyson,
Reg. No. 28,543; Robert M. Carwell, Reg. No. 28,499; Jeffrey S. LaBaw, Reg. No. 31,633; Douglas H.
Lefeve, Reg. No. 26,193; Casimer K. Salys, Reg. No. 28,900; David A. Mims, Jr., Reg. No. 32,70S;
Richard A. Henkler, Reg. No. 39,220; Anthony V. England, Reg. No. 35,129; Volel Emile, Reg. No.
39,969; Leslie A. Van Leeuwen, Reg. No. 42,196; Christopher A. Hughes, Reg. No. 26,914; Edward A.
Pennington, Reg. No. 32,588; John E. Hoel, Reg. No. 26,279; Joseph C. Redmond, Jr., Reg. No. 18,753;
Marilyn S. Dawkins, Reg. No. 31,140; David W. Victor, Reg. No. 39,867; William K. Konrad, Reg. No.
28,868; Alan S. Raynes, Reg. No. 39,809.

Send correspondence to:

David Victor, Esq.
1180 South Beverly Dr., Ste. 501
Los Angeles, CA 90035

Direct all telephone calls to David Victor at (310) 553-7977

FULL NAME OF INVENTOR THREE: David Allen Schell

INVENTOR'S SIGNATURE:

DATE:

RESIDENCE: 1103 Hamilton Way
Raleigh, North Carolina 27713

CITIZENSHIP: United States

POST OFFICE ADDRESS: 1103 Hamilton Way